KMID : 1132720180160040040
Genomics & Informatics
2018, Volume 16, No. 4, p. 40
Opinion: Strategy of Semi-Automatically Annotating Full Text Corpus of Genomics & Informatics
Park Hyun-Seok

Abstract
There is a community need for an annotated corpus consisting of the full texts of biomedical journal articles. In response, a prototype full-text corpus of Genomics & Informatics, called GNI version 1.0, was recently published, with 499 annotated full-text articles available as a corpus resource. However, GNI needs to be updated, because the texts were shallow-parsed and annotated with several existing parsers. I list the issues associated with upgrading the annotations and offer an opinion on the methodology for developing the next version of the GNI corpus, based on a semi-automatic strategy for linguistically richer corpus annotation.
Keywords
biomedical text mining, corpus, text analytics
Listed journal information
Korea Research Foundation (KCI), KoreaMed